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Abstract Two degenerate flavours of quarks are simulated with small masses down to about one fifth of 
the strange quark mass by using the two-step multi-boson (TSMB) algorithm. The lattice size is 8 3 • 16 
with lattice spacing about a ~ 0.27 fm which is not far from the Nt = 4 thermodynamical cross-over 
line. Autocorrelations of different physical quantities are estimated as a function of the quark mass. The 
eigenvalue spectra of the Wilson-Dirac operator are investigated. 



1 Introduction 

The question about the computational cost of dynamical 
quark simulations is a central issue in lattice gauge the- 
ory. Existing unquenched simulations are typically done 
in a region where the quarks are not light enough, in most 
cases - especially in case of Wilson-type quarks - with 
two light quark flavours (it and d) having masses larger 
than half of the strange quark mass (m u d > |m s ). The 
physical masses of u- and d-quarks are so small that in 
the foreseeable future simulations can only be carried out 
at somewhat higher masses. In order to extrapolate the 
^ . results to the physical masses chiral perturbation theory 
$_i ' based on the low energy chiral effective Lagrangian can 
^ - be used. However the systematic errors can only be con- 
trolled if the dynamical quark masses in the simulations 
are close enough to the physical point. For instance, in 
case of partially quenched simulations to determine the 
low energy constants in the chiral effective Lagrangian of 
QCD we would like to reach at least ra u( i < \m s jjj. 

Going to light quark masses in unquenched QCD sim- 
ulations is a great challenge for computations because 
known algorithms have a substantial slowing down to- 
wards small quark masses. The present status has been re- 
cently summarized by the contributors to the panel discus- 
sion at the Berlin lattice conference Inspired 
by the results presented there the computational cost of a 
simulation with two light quarks will be parametrized in 
the present paper as 



C = F (romj" 



(i) 



Here r$ is a physical length, for instance the Sommer 
scale parameter Q, the pion mass, L the lattice ex- 
tension and a the lattice spacing. The powers Z n> L,a an d 
the overall constant F are empirically determined. The 



value of the constant factor F depends on the precise 
definition of "cost" |{J. For instance, one can consider 
the number of floating point operations in one autocor- 
relation length of some important quantity, or the num- 
ber of fermion-matrix-vector-multiplications necessary for 
achieving a given error of a quantity. Of course, the cost 
also depends on the particular choice of lattice action and 
of the dynamical fcrmion algorithm which should be op- 
timized. 

An alternative parametrization can be obtained from 
the one in (jl]) by replacing the powers of r^ra^ by those 
of m^/mp. hi fact, the results of the CP-PACS, JLQCD 
Collaboration have been presented by A. Ukawa at the 
Berlin lattice conference || in this form: 



(2) 



Cu = Fu ^ ' £ (1°) 



F v = 5.9- 10 6 flop 

= 6 , z L = 5 , 



■'^71 f) 



(3) 
(4) 



Since the determination of the p meson mass is difficult for 
light quarks when the decay p — > 7T7t is allowed, we prefer 
the form in (nj). Other parametrizations used for Wilson- 
type quarks ^H[01 are given under the assumption that 
z-k — z a = z aTr when in (Q) the physical length parameter 
ro disappears. 

In the present paper we report on the results of ex- 
tended test runs with the simple Wilson fermion action 
using the two-step multi-boson algorithm |lC| ] in order to 
determine the quark mass dependence of the computa- 
tional cost of dynamical Monte Carlo simulations with two 
light flavours in the region m u d > \m s . For the definition 
of the quark mass the dimensionless quantity 



(5) 



2 
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is used, which already appears in (!]). This is a possible 
definition for small quark masses because for m q — > 
the pion mass behaves as oc ^Jm q . For defining the 
value of M r which corresponds to the strange quark mass 
one can use unquenched Nf = 2 lattice data. For in- 
stance, the experimental value of the f2~ baryon mass 
m Q- = 1-672 GeV and ro = 0.5 fm give r m n - = 4.237. 
Interpolating CP-PACS results for the A baryon mass 
at their largest (3 value (3 = 2.20 between k = 0.1363 and 
k = 0.1368 one can match r^niA = 4.237 if their pion 
mass is r^m^ ~ 1.76. This gives for the strange quark 
mass M r , strange — 3.1. Of course, there are also other 
ways to estimate M rtStra nge which might give slightly dif- 
ferent values. In the present paper, without attempting to 
really compute the strange quark mass, we shall stick to 
the operational definition 



M, 



r, strange 



3.1 



(6) 



The Monte Carlo simulations are done near the N t — 4 
thermodynamical cross-over line that is for a ~ 0.27 fm. 
The lattice size is 8 3 • 16 implying a physical lattice ex- 
tension L ~ 2.2 fm. Later on we shall also extend our 
investigations to 12 3 • 24 and 16 3 • 32 lattices. Our present 
studies can be considered as complementary to the ones 
on larger lattices (closer to the continuum limit) but at 
larger quark masses (typically m«j > ^m s ) 

In addition to obtaining estimates of autocorrelation 
lengths as a function of the quark mass we also performed 
a detailed study of the small eigenvalue spectra both for 
the hermitean and non-hermitean Wilson-Dirac fermion 
matrix. Besides giving important qualitative information 
about quark dynamics this also allows to clear the issue 
of the sign problem of the quark determinant. For an odd 
number of Wilson-type quark flavours the fermion deter- 
minant can have both signs because there might be some 
eigenvalues (of the non-hermitean fermion matrix) on the 
negative real axis. Since for importance sampling a posi- 
tive measure is required, the determinant sign can only be 
taken into account in a measurement reweighting step. A 
strongly fluctuating determinant sign is a potential dan- 
ger for the effectiveness of the Monte Carlo simulation 
because cancellations can occur resulting in an unaccept- 
able increase of statistical errors. We actually study this 
question here with two degenerate quark flavours (Nf = 2) 
where in the path integral the square of the fermion deter- 
minant appears and hence the sign is irrelevant. But our 
two quarks are much lighter than the physical s-quark. 
Therefore the statistical insignificance of negative eigen- 
values in this case hints towards the absence of the sign 
problem in the physical case of Nf = 2 + 1 quark flavours, 
when the sign of the s-quark determinant could, in prin- 
ciple, cause a problem. 

The plan of this paper is as follows: in the next sec- 
tion we shortly introduce the parameters of the TSMB 
algorithm and give some details of our implementation on 
different computers. In section ^| the autocorrelations are 
investigated for some basic quantities such as the average 
plaquette and the pion mass. Section || contains a detailed 
study of the small eigenvalue spectra of the fermion ma- 



trix. The last section is devoted to discussion and conclu- 
sions. 



2 The TSMB algorithm 

We use in this study the two-step multi-boson (TSMB) 
algorithm which has been originally developed for Monte 
Carlo simulations of the supersymmetric Yang-Mills the- 
ory [jlFJ but it can also be applied more generally ]12]| . 



2.1 Algorithmic parameters 

TSMB is based on a representation of the fermion deter- 
minant in the form 



|det(Q)| 



N, 



1 



detpW(Q 2 )detP^(Q 



(2), 



(7) 



Here Nf denotes the number of fermion flavours and Q is 
the fermion matrix, which in the present paper is equal to 
the Wilson-Dirac matrix 



Qys,xr — 3yx$sr 



(8) 



with x, y denoting lattice sites, r, s colour (triplet) indices, 
p, the unit lattice vector in direction fx, U xu £ SU(3) gauge 
link matrices and k the hopping parameter. The hermitean 
Wilson-Dirac fermion matrix is defined as usual by 

Q = mQ = Q f • (9) 

The polynomial approximations in (Q) satisfy 
/*)(*)**-"//*, 
Hm P<$(x)pW(x) = x~ N f' 2 , xe[e, A] (10) 

n2^oo 

where the interval [e, A] covers the spectrum of the squared 
hermitean fermion matrix Q 2 on a typical gauge configura- 
tion. The first polynomial PW is a crude approximation 
with relatively low order. It is used in the multi-boson 
representation of fermion determinants |Tj|. The second 
polynomial P^ is a correction factor which is taken into 
account in the gauge field updating by a global accept- 
reject step. For this a polynomial approximation of the 
inverse square root of P^ is also needed: 



p(f( x )^p£( x y 



(ii) 



The limit 712 — > oo can be taken in the computed expecta- 
tion values if one produces several update sequences with 
increasing ri2 or, more conveniently, one can keep ni fixed 
at some sufficiently large value for a good approximation 
and introduce a further polynomial P*- 4 -* satisfying 

lim P^{x)PS}(x)P^{x) = x- N f/ 2 . (12) 
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P^ 4 ) can be taken into account by reweighting the gauge 
configurations during the evaluation of expectation val- 

(2) 

ues. In most cases the order n% of Pn 2 can be chosen high 
enough such that the reweighting correction has a negli- 
gible effect on expectation values. In any case the evalua- 
tion of the reweighting factors is useful because it shows 
whether or not the two-step approximation in (|l0|) is good 
enough. For a recent summary of some details of TSMB 
and for references see section 3 of Q . 

The Monte Carlo integration of the path integral is 
performed by averaging over a sequence (Markov chain) of 
multi-boson and gauge field configurations. The ni multi- 
boson fields (<P) and gauge fields (U) are updated in re- 
peated update cycles consisting of several sweeps over the 
multi-boson fields and gauge field. For the multi-boson 
fields we use (local) heatbath and overrelaxation as well 
as global quasi-heatbath |l5| sweeps. For the gauge field 
update heatbath and overrelaxation sweeps are alternated. 
After several gauge field sweeps a global Metropolis accept- 
reject correction step is performed by the polynomials P^ 2 ' 
and P< 3 ). The update sequence within a cycle is subject 
to optimization with the goal to decrease autocorrelations. 
We tried several kinds of update sequences within an up- 
date cycle. A typical sequence was: 3 <£-overrelaxations, 1 
<£-heatbath, 12 [/-overrelaxation, global [/-Metropolis, 3 
<£-overrelaxations, 1 <£-heatbath, 6 [/-heatbath, global U- 
Metropolis. In every 10-th cycle the first ^-overrelaxation- 
<£-heatbath combination was replaced by a global quasi- 
heatbath. 

2.2 Implementation and performance 

We have implementations of the updating and measure- 
ment programs in TAOmille for the APEmillc and in C++/ 
MPI. The latter implementation is usable on many differ- 
ent architectures as long as they provide a C+- 1- com- 
piler and, in case of parallel computers, support MPI. In 
the updating program the computing time is dominated 
by the fermion-matrix- vector-multiplications (MVMs), 2- 
(n 2 + n 3 ) of them are needed for the correction step and 
0(100 • n%) for the global heatbath and quasi-heatbath 
p5| . Altogether they make up 60% — 80% of the comput- 
ing time. In the most interesting regions of small quark 
masses the program is dominated by the MVMs even more 
strongly. The same is true for the measurement program, 
where smearing and calculation of simple Wilson loops 
takes only a few percent of the time. It is therefore of 
the utmost importance to improve the performance of 
the MVM routines, both preconditioned (for the correc- 
tion step and the measurements) and non-preconditioned 
(for the global heatbath). This has been done for the 
APEmille, the Cray T3E with the KAI C++ compiler, 
and for a multi-node Pentium-4 cluster here also exploit- 
ing the possibilities of SSE and SSE2 instructions. Re- 
sults are given in tabic [l]. Note that an important fea- 
ture of the SSE instructions is that in single precision the 
peak performance is doubled compared to double preci- 
sion. The performance numbers in table ^ are substan- 
tially influenced by the communication costs among corn- 



Table 1. Performance of the matrix- vector- multiplication in 
MFlops and percent relative to peak performance on one board 
(8 nodes) on the APEmille and on 8 processors on the T3E and 
P4-cluster for a 8 3 ■ 16 lattice. 





APEmille 


T3E-1200 


P4-1700 


32 bit 


1008 (23.9%) 


912 (9.5%) 


4322 (7.9%) 


64 bit 




712 (7.4%) 


2087 (7.7%) 



puting nodes. Without communications the numbers both 
for APEmille and P4-cluster would be almost a factor 
of two higher. On larger volumes than those considered 
here communication will have less influence on the perfor- 
mance. 

Since the matrix multiplications dominate the comput- 
ing time it is reasonable to express e.g. autocorrelations in 
units of MVMs. The remaining part of the computation is 
given by the local updates. These are composed of parts 
which can be essentially thought of as pieces of MVMs, 
too. As a result the following approximate formula for the 
total amount of MVMs needed for one update cycle is 
obtained: 

AWM/cycle ~ 6 (n 1 N< t , + N u ) + 2 {n 2 + n z )N c + I G F G . 

(13) 

Here N$ is the number of local bosonic sweeps per update 
cycle, Nu the number of local gauge sweeps, Nq the num- 
ber global Metropolis accept-reject correction steps, and 
Iq and Fq give the number of MVMs and frequency of 
the global heatbath. 

For data from APEmillc and Cray the estimate of the 
cost of the local updates obtained from (|l3|) agrees with 
the actual costs up to 5%. Therefore the final costs in units 
of MVM based on ([II]) are not much influenced by the ap- 
proximation. This is not true for the data presented for the 
P4-1700 system, since in this case the matrix multiplica- 
tion and the local updates are not treated homogeneously. 
Indeed the former is written in assembler using SSE/SSE2 
instructions while our code for the local updates is written 
in C++ and compiled with the g++ compiler. As a result, 
the estimate for the cost of the local updates is in this case 
underestimated by about a factor three. Still we take the 
above formula as a reference when tuning the parameters 
because the number of MVMs is more generally applicable 
as it does not depend on implementation details. In ad- 
dition, in the future the local updates could be rewritten 
by using SSE/SSE2 instructions, too, so eliminating the 
non-homogeneity with the MVMs. 

It is sometimes interesting to convert the number of 
MVMs into the number of floating point operations. On 
our 8 3 • 16 lattice this conversion is approximately 

1 MVM ~ 1.1 • 10 7 flop . (14) 

3 Autocorrelations at small quark masses 

The bare parameters of the QCD lattice action with Wil- 
son quarks {(3 for the SU(3) gauge coupling and k for the 
hopping parameter of two degenerate quarks) have to be 
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Table 2. Bare couplings, parameters of the TSMB algorithm as defined in section 2.1 and total statistics in 1000 update cycles 
(Uk) of our runs. 



run 




K 


m 


n 2 


ri3 


724 


A 


e 


u k 


(a) 


5.28 


0.160 


20 


40 


70 


100 


2.8 


1.75 -10" 2 


80 


(b) 


5.04 


0.174 


28 


90 


120 


150 


3.0 


3.75 -10- 5 


33 


(c) 


4.84 


0.186 


38 


190 


240 


300 


3.6 


1.44 -10 -3 


31 


(d) 


4.80 


0.188 


44 


240 


300 


300 


3.6 


7.2 -10 -4 


12 


(e) 


4.76 


0.190 


44 


360 


380 


500 


3.6 


2.7 -10~ 4 


144 


(/) 


4.80 


0.190 


44 


360 


380 


500 


3.6 


2.7 -lO" 4 


224 


G?) 


4.72 


0.193 


52 


600 


750 


800 


3.6 


0.9 -10~ 4 


196 


(h) 


4.68 


0.195 


66 


900 


1200 


1100 


3.6 


3.6 -lO"" 


200 


(<) 


4.64 


0.197 


72 


1200 


1500 


1400 


3.6 


1.8 -10" b 


110 


(J) 


4.64 


0.1975 


72 


1200 


1350 


1400 


4.0 


2.0 -10~ b 


4 



tuned properly in order to obtain the desired parameters 
in the Monte Carlo simulations. We are interested in the 
quark mass dependence of the simulation cost of hadron 
spectroscopy applications, therefore we want to keep the 
physical volume of our lattices sufficiently large and (ap- 
proximately) constant. For a 8 3 • 16 lattice, a lattice spac- 
ing a ~ 0.27 fm implies a lattice extension of L ~ 2.2 fm 
which is a reasonable starting point for spectroscopy. Pre- 
vious Monte Carlo simulations with Nf — 2 Wilson quarks 
]l6| , flT] showed that this kind of lattice spacing is realized 
near the Nt = 4 and N t = 6 thermodynamical transi- 
tion lines which, therefore, provide a good orientation. We 
started our simulations at a relatively large quark mass 
on the Nt = 4 transition line and then tuned f3 and k to- 
wards smaller quark masses keeping r$/a approximately 
constant. A summary of simulation points is given in ta- 
ble [2] where some important algorithmic parameters of the 
TSMB are also collected. 

Most of the runs have been done with 32-bit arith- 
metics. Exceptions are run (j) and about 10% of the statis- 
tics in run (h) where 64-bit arithmetics was used. In gen- 
eral, on the 8 3 • 16 lattice it is not expected that single 
precision makes any difference. In fact, the double preci- 
sion results in run (h) were compatible within errors with 
the single precision ones. 



3.1 Physical quantities 

In order to monitor lattice spacing and quark mass one 
has to determine some physical quantities containing the 
necessary information. As discussed before, we define the 
physical distance scale from the value of the Sommer scale 
parameter rQ. Once tq in lattice units is known one can 
transform any dimensionful quantity, for instance the pion 
mass m- 7Tl from lattice to physical units. Therefore a care- 
ful determination of tq /a is important. For a dimensionless 
quark mass parameter one can use M r as defined in (||): 
M r = (ro/a ■ am^) 2 . In addition, we also measured some 
other quantities like m p and another definition of the 
quark mass m q for obtaining a broader basis for orienta- 
tion. In the next subsections the procedures for extracting 
these quantities will be described in detail. 



3.1.1 Masses and amplitudes 

In order to extract masses and amplitudes we compute 
the zero-momentum two-point functions depending on the 
time-slice distance (xq — yo) 



C XY (x -y ) = yJ2( Xt W Y ^ 



(15) 



with 


X = (x 


x) and 








X(x) 


= Y(x) 


= P 5 (x) 


= q'{x)^q{x) 


(Cp P (x - 


yo)), 


X(x) 


= Y(x) 


= A (x) 


= q'(x)^ 5 -y Q q(x) 


{Caa(xo - 


- yo)), 


X{x) 


= Y(x) 


= Vi(x) 


= q'{x)^ 5 ^q(x) 


(C ViVi (x 


-yo)) 



we also consider the mixed correlator with 



X(x)=A {x), Y(x)=P 5 (x) 



(C A p(x ~y )) 



Exploiting translation invariance we pick the source y in 
( |l5|) at random over the lattice. Taking into account cor- 
relations between different time-slices, one sees that this 
procedure is optimal for the ratio computational cost / 
final statistical error for hadronic observables. 

Masses and amplitudes are in general obtained from 
the asymptotic behaviour of the correlators]] 



e 



2m 



V Y ( e -m p T + ^l)X+Y e -ra p{ L t -T) ){l&) 

(17) 



Cxy(T) 

t X Y = V(o|*((%)(o|y(o)|p> , 



where \p) is the zero-momentum state of the particle as- 
sociated with the operators X(x) and Y(x), m p the corre- 
sponding mass and (— l) x ( y ) the time-parity of X(Y)(x). 
We determine parameters m p and {;xy by global fitting 
over a range of time-slice distances (after time- symmetriza- 
tion) T G [T m i n , L t /2]. We find the optimal value for 
Tmin by checking the behaviour of the effective local mass 



Amplitudes are assumed to be real. 
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e ff(T). The latter is implicitly defined by the relation 



Cxy{T) 
Cxy(T+1) 



(18) 



e -m eff (T)T + (-1)X+Y e -m eff (T) (L t -T) 
e -m aff (T)(T+1) + (_ \}X+Y e -m c// (T) (L t -T- 1) ' 

The value of T m i n is fixed by the onset of the plateau 
for m e ff(T) as a function of T. The plateau- value for the 
effective mass is always consistent with the result from the 
global fit procedure. The latter gives however the most 
precise determination. 

A typical problem associated with small quark masses 
is a delayed asymptotic behaviour for correlators (i.e. a 
larger T m i n ) resulting in large errors for the hadronic ob- 
servables. This problem was solved by applying Jacobi 
smearing Jl8| on both source and sink. Jacobi smearing 
was applied in a different context [|l9|,|2(| in the same situ- 
ation of light fermionic degrees of freedom, and it showed 
to improve the overlap of the hadronic operators with the 
bound-state. Amplitudes and decay constants have been 
determined from correlators with local operators. 

We determine the pion mass from the asymptotic 
behaviour of the correlator Cpp(T). From Cpp{T) one 
can also extract the amplitude = (0|P5(0)|7r) by identi- 
fying g-n = £pp. The p meson mass m p is determined from 
the asymptotic behaviour of the correlator 



1 

Cvv(T) — - ^2 Cvi 



(19) 



i=l 



For the determination of the pion decay constant f n = 
m" 1 (0|Ao(0)|tt) we apply two different methods. In the 
first, the amplitude (0|A (0)|7r) is obtained by fitting the 
asymptotic behaviour of the correlator Caa (T) while the 
pion mass is the one coming from Cpp(T). In the second 
method pM, we fit the amplitude-ratio 



tap 



(Ql^o(Q)k) 

<0|P 5 (0)k) 
by using the asymptotic behaviour 



C A p(T) 
C PP (T) 



= r A p tanh[m 7r (L i /2 - T)] 



(20) 



(21) 



where is fixed at the best-fit value from Cpp{T). The 
determination of / w is then obtained from the relation 



U = m^rApg* 



(22) 



using for g^ the determination from Cpp (T) . In the region 
of large and moderate quark masses the second method 
gives by far the most precise determination of f v . This is 
generally no more true for very light quarks where data are 
highly correlated. Here the best determination was picked 
from the two different methods on a case-by-case basis. 

Using the above determinations we can extract the 
quark mass defined by the PC AC relation 



rn 



PCAC 



A 

2g n 



(23) 



The PCAC quark mass gives us a second definition of the 
physical quark mass alternative to (Em) 



PCAC 



(24) 



We estimated statistical errors on hadron quantities 
by applying the Jackknife procedure on blocks of data of 
increasing size. The same procedure is applied also for 
the Sommer scale parameter (see next subsection). This 
method provides us with a definition of the integrated au- 
tocorrelation Ti n t of the pion mass. Au tocorrelations in 
general will be discussed in section ( |3.2|) . The results for 
the hadronic quantities are listed in table |[ 



3.1.2 Sommer scale parameter 

There are several phenomenological models that can be 
used to get an estimate for the Sommer scale parameter 
rrj in nature, and most of them point towards a value of 
ro ~ 0.49fm. On the lattice r^/a can be calculated from 
the static quark potential, which is in turn determined 
from Wilson loops. The basic idea is simple, but since we 
want to match all our results to this parameter it is crucial 
to get a precise determination. To achieve this we follow 
the method proposed b y M ichael and collaborators [^2| 
p3| and some details in p4| . 

Using the variational approach of [S we get matrices 
Wij(r,t) consisting ofrxt loops of smeared gauge links, 
where our smearing technique of choice is APE-smearing 
(indices i,j label the level of smearing). We use for our 
determinations two and six or two, four and six levels of 
smearing and symmetrize the matrices Wij . The ratio sta- 
ple/link is set to a — 0.45. 

From the solutions to 



W^(r,t)0(r)f> = A< fc >(r;t,to)Wy(r,io)0(r)$ fc) , 
i,j,fc=0,l(,2) 



(25) 



one gets the eigenvector 0(r)j°' ) for the largest eigenvalue 

A(°)(r; t = trj + l> *())• This equation is solved by transform- 
ing it into an ordinary eigenvalue equation, where several 
ways are possible: 

W(r,t )- 1 W(r,t)4> = \<P (26) 
W(r,t)W(r,to) -1 (W(r,to)0) = \(W(r,t )& (27) 
W(t, t )- 1/2 W(r, t)W(r, t a y^ 2 (W(r, h) 1 ' 2 ^) 
= \(W(vM) 1/2 <P) ■ (28) 

In the literature |25| the third version has been used. How- 
ever this can onlybe done with extremely good statistics. 
Otherwise it is possible that, due to statistical fluctua- 
tions, the matrix Wij gets negative eigenvalues making 
the (real) square root impossible. We checked that the 
first two versions give numerically exactly the same re- 
sult. For the final determinations we choose the first ver- 
sion (E9), where one has to be careful about the fact that 
W(r, to) _1 VF(r, t) has no longer to be symmetric, compli- 
cating the calculation of the corresponding eigenvectors. 
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Once the eigenvector 4>( r ) j nas been obtained, we can 
project the matrix Wy to the ground state: 

W o (r,t) = 0(r)fV ij (r,t)0(r)f ) . (29) 

This correlator leads to good estimates of the ground state 
energy 

W (T,t) \ 



E (r,t)=]n 



W (r,t+1) 



(30) 



The potential V(r) is estimated by averaging Eo(r,t)/t 
over time extensions t with t > 1 and weight given by the 
Jackknife error. Compared to some other methods this 
way of extracting the potential seems to give the most 
reliable estimates with smallest error bars. 

The Sommer scale parameter is defined in terms of the 
potential as 



ay_ 

dr 



1.65 



(31) 



Having a reliable static quark potential we can follow [£6| 
by fitting the potential to 



V(r) =V + , 



4" 



(32) 



with r — |r| and [i] being the tree-level lattice Coulomb 
term 

r L r d 3 k cos(k-r) 



-(2^) 3 4E =1 sin 2 (fc,/2) 



(33) 



Due to the small lattice size we had to drop in ( p2| ) the 
additional correction term / • ([^] — i), which could have 
been used to estimate 0(a) effects, fixing e = 7r/12. Bring- 
ing together the above equations we extract rg from 



1.65 - e 



ro = Jt^—Z. (34) 



3.2 Autocorrelations 

The "cost" of numerical simulations can be expressed in 
terms of the necessary number of arithmetic operations for 
obtaining during the Monte Carlo update process a new 
"independent" gauge field configuration. The real cost can 
be then easily calculated once the price of e.g. a floating 
point operation is known. For a definition of the indepen- 
dence of a new configuration the integrated autocorrelation 
Ti n t is used. (For a general reference see f27j.) T int does de- 
pend on the particular quantity it refers to. Of course, it 
is reasonable to choose an "important" quantity as, for 
instance, the pion mass but simple averages characteriz- 
ing the gauge field such as the plaquette average are also 
often considered. 

In case of the TSMB algorithm a peculiar feature is 
the reweighting step correcting for the imperfection of 
polynomial approximations. As will be discussed in the 
next subsection, in most of our runs this correction is to- 
tally negligible but even in these cases it is important to 



perform the reweighting on a small subsample of config- 
urations in order to check that the used polynomials are 
precise enough. In some cases, especially for very small 
quark masses, there are a few exceptional configurations 
with small eigenvalues of the squared hermitean fermion 
matrix (Q 2 ) which are practically removed from statistical 
averages by their small reweighting factors. In the calcula- 
tion of expectation values these reweighting factors were 
always taken into account. For the autocorrelations the 
effect of the exceptional configurations is in most cases 
negligible. 



3.2.1 Integrated autocorrelation of the pion mass 

In case of secondary quantities such as the pion mass, or 
in general any function of the primary expectation values, 
the straightforward definition of the integrated autocorre- 
lation Ti n t for primary quantities is not directly applicable. 
In fact, there are several possibilities which we shall now 
shortly discuss. 

Linearization: as it has been proposed by the ALPHA 
Collaboration ^ , in the limit of high enough statistics the 
problem of the error estimate and of the autocorrelation 
for secondary quantities can be reduced to considering a 
linear combination of primary quantities. Let us denote 
the expectation values of a set of primary quantities by 
A a , (a — 1,2,...). Their estimates obtained from a data 
sequence are a a . For high statistics the estimates are al- 
ready close to the true values: \a a — A a \ -C 1. Therefore, 
if the secondary quantity is defined by a function f(A) of 
primary quantities, we have 



f(s)-f(A)^J2(^-A c 



MA) 
dA n 



(35) 



The values of the derivatives are constants therefore on the 
right hand side there is a linear combination of primary 
quantities which can be handled in the same way as the 
primary quantities themselves. Since 



df(A) _ df(A) 
dA a dA a 



fa 



one can consider the linear combinations 



.4 



y A a f a 



^ Qotfot 



(36) 



(37) 



and the variance of the secondary quantity can be esti- 
mated as 

ajc((a f -A f f) . (38) 

(Note that here (. . .) stands for the expectation value in 
an infinite sequence of identical measurements with the 
same statistics as the one under consideration.) According 
to (|3|) the integrated autocorrelation of the secondary 
quantity can be defined as the integrated autocorrelation 

° iA f\ 

This way of obtaining error estimates and autocorre- 
lations of secondary quantities is simple and generally ap- 
plicable. Let us note that because of the reweighting even 
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the simplest physical quantities are given by ratios of two 
expectation values and are, therefore, secondary quanti- 
ties. 

Blocking: in case of sufficiently large statistical sam- 
ples the integrated autocorrelations of secondary quanti- 
ties can also be obtained by comparing statistical fluc- 
tuations of data coming from the measurement program 
before and after a blocking procedure. The blocking pro- 
cedure eliminates for increasing block size the autocor- 
relations between data and the final error is the one for 
uncorrelated data. Since the latter is the true error of the 
measurement, it is appropriate to use this definition of 
Ti n t to estimate the real cost of a simulation. In the case 
of primary quantities such as the plaquette this definition 
coincides with the usual one. 

For a generic quantity A one can define ujf (A) as the 
standard deviation of the data at blocking-level n. In the 
case of the pion mass, we determine this quantity by ap- 
plying the jackknife procedure on the hadron correlators 
averaged over blocks of length n. In the limit of infinite 
statistics, for increasing n, cr®(A) should approach after a 
transient an asymptotic value corresponding to the stan- 
dard deviation of the uncorrelated data <r unc (A). For fi- 
nite statistics, cr^(A) fluctuates around a unc (A). We de- 
termine a unc (A) by averaging <j„ (A) over a range of block 
sizes n after the transient. The error on this determination 
is given by the mean dispersion of data around the aver- 
age. Once o~ unc (A) is given the integrated autocorrelation 
is defined as 

1 (<Junc{AY " 



history of smallest eigenvalue on different replica of run (i) 



Tint 



** (A) 



(39) 



Another way of writing the above formula is 



2 Til 



(40) 



where N stat is the original statistics and N unc is the num- 
ber of uncorrelated configurations; so 2rj„t can be thought 
of as the distance between two uncorrelated configura- 
tions. 

Covariance matrix: in most cases it is a good approxi- 
mation to assume that the probability distribution of the 
estimates a a of the primary quantities A a is Gaussian: 

P(a) oc exp i ~ ^2{a a - A a ) C~^> (««' - A a i) > 

L. aa' ) 



The covariance matrix is 



(41) 



o.ooo l 



le-05 



le-06 



O.OOO 1 



le-07 - 



0.0001 



le-05 



le-06 



le-07 




0.0001 



le-07 



5000 10000 15000 20000 25000 30000 



Figure 2. Histories of the smallest eigenvalue at /3 = 4.64, 
k = 0.197 for two independent lattices. The upper figure shows 
the typical case when the smallest eigenvalue stays most of the 
time above e shown by the dashed line. The lower figure is the 
history with exceptionally small eigenvalues. The measurement 
of physical quantities was started at the vertical line. 



of a a by generating a large number of estimates. From the 
error it is also possible to obtain an indirect estimate of 
the integrated autocorrelation from a formula like (|39|). 

The integrated autocorrelation of the pion mass (and 
the error of the pion mass) can be obtained by any of 
these three methods and the results are generally consis- 
tent with each other. The method based on linearization is 
rather robust already at the level of statistics we typically 
have. The blocking method becomes easier unstable, espe- 
cially for moderate statistics. This is understandable since 
the statistics in the individual blocks is reduced compared 
to the total sample. The method based on the covariance 
matrix needs sufficient statistics in order that the esti- 
mate of the covariance matrix be reliable. This is usually 
the case for effective masses derived from intermediate dis- 
tances but - in our runs - this method sometimes fails for 
the largest distances. 



((a a — A a )(a a < - A a i)) = (a a a a >) - (a a ){a a ') = C aa > 



The elements of the covariance matrix can be estimated 
from the data sequence by determining the integrated au- 



(42) 3.2.2 Correction factors 



to correlations r. 



2r 



{A a A a ,) 



Nstat 



Once the probability distribution of the estimates P(a) is 
known one can obtain an error estimate for any function 



As has been described in subsection 2.1 a fourth polyno 



mial Pn 4 J can be used to extrapolate to infinite polynomial 
order therefore avoiding the need for several simulations 

( 2) 

(43) with different orders of the second polynomial P„ 2 . In our 
runs the evaluation of the reweighting factors was done in 
the way described in detail in Few smallest eigen- 
values A 2 of the squared hermitean fermion matrix Q 2 , 
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Correction factor histogram for run (h) 



Correction factor histogram for run (i; 



0.5 1 1.5 

correction factor 




correction factor 



Figure 1. Correction factors for run (h) (left panel) and (i) (right panel). 



typically four, were explicitly determined and the corre- 
sponding correction factors were exactly taken into ac- 
count. In the subspace orthogonal to the corresponding 
eigenvectors a stochastic estimate based on four Gaussian 
random vectors was taken. (Note that in the limit of infi- 
nite statistics a stochastic estimate always gives a correct 
result independently of the number of random vectors, no 
systematic error is introduced.) 

As stated before, most of our results were obtained 

(2) 

with second polynomials Pn 2 which gave already a good 
approximation of the fermionic measure. In this case the 
inclusion of the correction factors had nearly no influence 
on the final determinations since they were very close to 
one. Going to smaller quark masses the smallest eigenvalue 
starts to fluctuate more, and it is therefore no longer rea- 
sonable to try to use a second polynomial that is good 
enough for all cases. Such large fluctuations appeared in 
runs (h) and (i). The histograms of reweighting factors 
are illustrated by figure |l]. It turned out that, as expected, 
the inclusion of the correction factors had the nice effect 
of reducing error bars. This is especially noticeable for 
fermionic quantities and in particular the pion mass which 
is highly correlated to the smallest eigenvalue. 

Nevertheless, even in these cases the effect of reweight- 
ing was so small that the individual estimates with correc- 
tion factors agreed within error bars with those without 
correction factors. As a whole, however, a minor system- 
atic increase in the masses could be seen. Both histograms 
in figure ^ have a tail towards small values which are due 
to eigenvalues that could have been further suppressed by 
a better second polynomial. As the figure shows, this tail 
is more important in run (i) than in run (h). A closer look 
at the smallest eigenvalue histories in run (?) reveals that 
the tail near zero was produced by one of the four indepen- 
dent parallel lattices when the smallest eigenvalue stayed 
for some time below the lower limit of the approximation 
interval e (see figure . 



Configurations with small eigenvalues A 2 are interest- 
ing because exceptionally small values could indicate cross- 
ing of real eigenvalues of the Wilson-Dirac matrix Q to the 
negative axis. This could give a negative sign for the de- 
terminant of a single quark flavour. We systematically an- 
alyzed in all our runs configurations with small A 2 search- 
ing for this effect. In the present study we found, for the 
first time in a QCD simulation with TSMB, configurations 
with real negative eigenvalues of the Wilson-Dirac matrix. 
This happened namely in one run, run (i). The (three) 
configurations are however statistically insignificant, since 
the corresponding reweighting factors are extremely small: 
2.7- 1CT 2 , 8.2-10" 4 and 3.0-10" 4 . Statistically they repre- 
sent less than 0.03 configurations in a sample with a total 
statistical weight of about 1600. 



3.2.3 Results for autocorrelations 

The analysis of the runs specified in table || gives the re- 
sults for physical quantities collected in table ]3[ The inte- 
grated autocorrelations, where they could be determined, 
are given in table ||. 

The quoted errors of autocorrelations were estimated 
in different ways. For the determination of the autocor- 
relation of the pion mass we apply the blocking method 
explained in section 3.2.1. In general, one has to say that 
in some cases our statistics is only marginal for a precise 
determination of the integrated autocorrelations. In some 
cases (run (a), (j)) we are not able to quote a reliable 
result for the autocorrelation of the pion mass. 

In the high statistics runs with small quark masses 
(e), (/), (g), (h) and (i) we had four independent parallel 
update sequences which could be used for a crude estimate 
of the errors. In addition, whenever the runs were long 
enough, we used binning with increasing bin lengths for 
the error estimates. 

In general, integrated autocorrelations of the average 
plaquette are longest. Those for the smallest eigenvalue 
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Table 3. Results of runs specified in table |2| for different physical quantities defined in the text. The values given in lattice 
units can be transformed to physical units by canceling the lattice spacing a with the help of the results for ro/a and using 
r = 2.53 GeV" 1 . 



run 


ro/a 


aU 


am, 


arrip 




M r 


Mr 


(a) 


1.885(30) 


0.3738(50) 


1.2089(36) 


1.2982(32) 


0.9312(17) 


5.19(20) 


0.498(12) 


(b) 


1.715(20) 


0.4321(23) 


1.0428(41) 


1.1805(38) 


0.8834(14) 


3.20(10) 


0.305(6) 


(c) 


1.616(110) 


0.4171(47) 


0.7886(40) 


1.0251(48) 


0.7693(32) 


1.61(24) 


0.148(11) 


(d) 


1.903(159) 


0.4199(75) 


0.753(11) 


0.999(12) 


0.752(11) 


2.05(40) 


0.155(13) 


(<0 


1.697(46) 


0.4191(20) 


0.7151(20) 


0.9941(19) 


0.7187(16) 


1.473(88) 


0.1229(41) 


(/) 


1.739(33) 


0.3658(34) 


0.5825(34) 


0.9089(47) 


0.6431(33) 


1.026(51) 


0.0811(30) 


(9) 


1.772(41) 


0.3791(39) 


0.5695(38) 


0.9116(33) 


0.6256(31) 


1.018(61) 


0.0770(32) 


(h) 


1.765(37) 


0.3668(54) 


0.5088(51) 


0.8983(35) 


0.5675(42) 


0.806(50) 


0.0596(27) 


(0 


1.812(46) 


0.3575(48) 


0.4333(48) 


0.8616(80) 


0.5002(60) 


0.616(45) 


0.0429(21) 


w 


1.756(128) 


0.3377(48) 


0.4205(54) 


0.859(12) 


0.4894(65) 


0.545(47) 


0.0363(38) 



Table 4. Integrated autocorrelations in update cycles ob- 
tained from runs specified by table ^[ In the second column 
C C y C i e gives the number of kMVM's (10 3 MVM's) per update 
cycle. The suffix min, plaq, 7r8 and refer to the minimal 
eigenvalue of Q 2 , the average plaquette, the pion correlator at 
distance d — 8 and the pion mass, respectively. 



run 


C cycle 


mm 
'int 


plaq 
int 


t t8 
' int 


m-rr 

T 

171 1 


(a) 


1.49 




200(20) 






(6) 


2.45 


340(60) 


350(50) 


152(20) 


140(20) 


(c) 


4.35 




420(80) 




150(20) 


(d) 


5.05 


~ 310 


490(90) 




170(90) 


(e) 


7.34 


550(110) 


490(40) 


274(41) 


207(33) 


(/) 


7.31 


810(110) 


800(90) 


367(110) 


187(63) 


id) 


10.5 


320(80) 


820(180) 


466(62) 


188(13) 


(h) 


16.2 


380(120) 


940(330) 


370(88) 


186(40) 


(i) 


20.4 


670(210) 


1500(300) 


283(67) 


153(54) 




17.4 


~ 390 


~ 1050 







are comparable but sometimes by a factor 2-3 shorter. 
The important case of r™^ is the most favorable among 
the quantities we have considered: it is by a factor 2 to 10 
shorter than rf^ q . Our experience was that the best values 
for t™1 could be achieved in runs where the lower limit 
of the approximation interval e was at least by a factor 
of 2-3 smaller than the typical smallest eigenvalue of Q 2 
and the multi-boson fields were relatively often updated 
by global quasi heatbath. 

Using the values given in table [i] one can extract, for 
instance, the behaviour of rf^ q as a function of the di- 
mensionless quark mass parameter M r . Since, according 
to table 0, the different runs are at slightly different val- 
ues of ro/a one can correct the points with an assumed 
power z a = 2 to a common value, s&y,r /a = 1.8. The re- 
sulting behaviour is shown by figure |3j (left panel) where 
a two-parameter fit cM* is also shown. The best fit is at 
c = 7.92(68) (10 6 MVMs), z = -2.02(10) with a x 2 per 
number of degrees of freedom of x 2 /d.o.f. = 1.8. (The re- 
sult for z remains the same if the common value of r§ /a 
is changed in the interval 1.6 < ro/a < 2.0.) 



The alternative parametrization in (Q) suggests a power 
fit as a function of m, /m p . A good fit can only be obtained 
in this case if the last point with the largest quark mass is 
omitted (see figure [|, middle panel) . The best fit param- 
eters are in this case c = 0.476(77), z = -6.03(41) with 
X 2 /d.o.f. = 1.1. The obtained power agrees very well with 
z„ p = 6 in (§) given by the CP-PACS, JLQCD Collabora- 
tion although the latter value was obtained in a range of 
substantially larger quark masses on large lattices. 

The data on the integrated autocorrelation of the small- 
est eigenvalues r™™ typically have larger errors. A fit of 
the form cM* is shown in figure [| (right panel) where 
c = 5.36(80), z = -1.48(25) with xVd.cf = 2.4. A fit 
to the integrated autocorrelation of the pion mass r™^ 
gives similar parameters: c = 1.99(16), z — —1.47(16) 
with x 2 /d.o.f = 1.7. This shows that for t™« and r™ 4 " 
the quark mass dependence is described by z^ ~ 3 which 
is, of course, more favorable than z^ ~ 4 for rf^ q . 

Concerning the quality of fits one has to remark that 
the different points belong to individually different opti- 
mizations of the polynomial parameters which have not 
necessarily the same quality. This implies an additional 
fluctuation beyond statistics. In view of this the x 2 per 
number of degrees of freedom values are reasonably good. 



4 Eigenvalue spectra 

The eigenvalue spectrum of the Wilson-Dirac matrix is 
interesting both physically and from the point of view of 
simulation algorithms. From the physical point of view 
the low-lying eigenvalues are expected to dominate the 
hadron correlators |^8|,^9| and carry information about 
the topological content of the background gauge field [[50| 
[H],[52). Although, as already stressed, in the present work 
we consider rather coarse lattices, given the importance 
of the question, it is interesting to see the effect of the 
determinant of light quarks on the qualitative properties 
of the eigenvalue spectrum. From the algorithmic point of 
view the knowledge of low-lying eigenvalues is crucial for 
the optimization of polynomial approximations. Finally, 
since we plan to perform simulations with an odd number 
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Figure 3. Left panel. Power fit of the plaquette autocorrelation given in units of 10 • MVM as a function of the dimensionless 
quark mass parameter M r . The best fit of the form cM* is at c = 7.92(68), z = —2.02(10). Middle panel. The same as a function 
of m w /m p . In this case the last point is omitted from the fit. The best fit of the form c(m^-/m p ) 2 is at c = 0.476(77), z = 
—6.03(41). Right panel. Power fit of autocorrelation for the smallest eigenvalue of Q 2 given in units of 10 6 • MVM as a function 
of the dimensionless quark mass parameter M r . The best fit of the form cM* is at c = 5.36(80), z — —1.48(25). 



of flavors |33|] , we have to consider the possibility of nega- 
tive (real) eigenvalues of the non-hermitean quark matrix 
Q, which would imply a negative determinant for a single 
quark flavour. For Nf — 2 the square of the determinant 
is relevant therefore the sign does not matter but the ab- 
sence (or statistical insignificance) of negative eigenvalues 
at very small quark masses would strongly support the 
assumption that for the heavier strange quark there will 
be no problem with the determinant sign. 

In order to study the low-lying spectrum of the eigen- 
values we used two methods: for the eigenvalues of the her- 
mitean fcrmion matrix with small absolute value the one 
by Kalkreuther and Simma [B4| and for the small eigenval- 
ues of the non-hermitean matrix the Arnoldi method p5| , 
. The determination of the eigenvalues of the hermitean 
matrix is in general much faster. However, the spectrum 
of the non-hermitean fermion matrix contains more infor- 
mations. First of all, the eigenvalues of Q depend trivially 
on the valence hopping parameter K va i because 

Q = 1 - Kva iD . (44) 

This is not true for Q. Moreover, because of the symme- 
tries 

Q ] = 75Q75 , ODO = -D , (45) 

where O xy = (-l)( xl + x2 + x3+x ^S xv , the spectrum of D 
is invariant under complex conjugation and sign change. 
As a consequence, it is sufficient to compute the low-lying 
spectrum of Q at an arbitrary value K va i = R, va i- Other 
K va i are easily obtained by a shift. The value of R va i is 
chosen such that it gives the best compromise of compu- 
tation time and precision. 

It turned out that the application of the Arnoldi al- 
gorithm is more efficient on the even-odd preconditioned 
matrix Q than on Q itself. The analytic relation between 
the eigenvalues of Q and Q can be used to transform the 



result back to Q. Indeed if Q is written in the form 

Q = l- K (° oe D ^ (46) 
then Q is given by 

^-""(SaJL) ■ (47) 

If v = (v e ,v ) is an eigenvector of Q with eigenvalues A 
then it satisfies 

(Xv e , Xv a ) = {v e - KD eo v 0l v - nD oe v e ) (48) 

and hence 

(1 - K 2 D oe D eo )v = Vo -(l- X) 2 v = A(2 - X)v . (49) 

As a result, the eigenvalues of Q are either 1 (in the even 
subspace) or they satisfy 

X = A(2 - A) . (50) 

Because of the symmetries mentioned above, the solutions 
of (^0|) will give precisely all the eigenvalues of the matrix 
Q. This relation also gives a possibility to perform a non- 
trivial check of the Arnoldi code. (In addition, we also 
compared the algorithm with a direct computation of all 
the eigenvalues on a small 4 4 lattice by means of a NAG 
library routine.) All checks confirmed the high precision 
given as an output by the ARPACK code, which was, in 
our cases, always better than 10~ 4 (relative precision). 

4.1 Small eigenvalues 

As a first task we computed the low-lying eigenvalues from 
sample sets of 10 configurations for runs in decreasing or- 
der of quark masses, namely those labeled with (a) and 
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(c)-(j) in table |^. In order to have a better access to 
the most interesting regions of the spectrum we analyzed 
each configuration from two different points of view. For 
each configuration we first determined the 150 eigenval- 
ues of the preconditioned Wilson-Dirac matrix (Q) with 
smallest modulus and then the 50 eigenvalues of the non- 
preconditioned one (Q) with smallest real part. 

Both computations were performed at an auxiliary value 
of K va i = 0.170, where the Arnoldi algorithm performed 
better. By using the analytical relations (Q) and ( |50| ) we 
transformed the eigenvalues to those of Q at the k value 
of the dynamical updates (k = n se a)- The results are plot- 
ted in figure |]. The dashed vertical line shows the limit 
for the computation of the eigenvalues with smallest real 
part: only the part of the spectrum to the left of this line 
is known. In a similar way, by computing the eigenvalues 
with smallest modulus, we have access to the portion of 
the spectrum inside the dashed circle. The circle is de- 
formed and not centered at the origin because it has been 
transformed together with the eigenvalues by using (|50[). 
In summary, the spectrum is not known in those points 
of the complex plane which are both to the right of the 
vertical line and outside the circle. 

Since the sequence from (a) to (j) corresponds to de- 
creasing quark masses it is not a surprise that the eigen- 
values have an increasing tendency to go to the left in the 
complex plane. At the same time they are pushed away 
from zero as an effect of including the fermionic determi- 
nant in the path integral measure. At very small quark 
masses a pronounced hole near zero is developing. For the 
continuum Dirac operator the spectrum is on a vertical 
line with some gap near zero. On our coarse lattices there 
is an additional horizontal spread of the eigenvalues and 
the picture is strongly deformed. 

The size of the holes produced by the determinant 
is very important if we have in mind the possibility of 
computing observables at a partially quenched K va i higher 
than n sea used in the update. The distance between the 
origin and the smallest real eigenvalue determines how 
much smaller masses (larger n va i) one can reach by par- 
tial quenching before encountering exceptional configura- 
tions. This question can be answered by studying config- 
urations with exceptionally small eigenvalu es. Of course, 
the reweighting factors discussed in section 3.2.2 have to 
be taken into account in this analysis because they sup- 
press such configurations to a large extent. 



4.2 Negative eigenvalues 

One of the purposes of the analysis of the eigenvalues was 
to determine whether there is a statistically significant 
presence of configurations with negative determinant. As 
already said, the sign of the determinant is easy to deter- 
mine from the low-lying spectrum since it is negative if an 
odd number of real negative eigenvalues occurs. In fact, 
non-real eigenvalues always appear in conjugate pairs. In 
the randomly chosen set of configurations reported above 
we did not find a single real negative eigenvalue. However, 
a set of O(10) configurations is a rather small subsample. 



Additional information on the presence or absence of 
negative eigenvalues in our samples is given by the distri- 
bution of reweighting factors. Crossing of eigenvalues to 
negative real axis implies small reweighting factors corre- 
sponding to very small eigenvalues of Q 2 below the lower 
bound of the interval [e, A]. The calculation of reweighting 
factors, which was carried out on every configuration in 
the selected subsamples, is much cheaper than the analy- 
sis of small eigenvalues of the non-hermitean matrix Q. As 
we discussed in section 3.2.2, the distribution of reweight- 
ing factors is strongly peaked near one in all runs, except 
for runs (h) and (i) which have high statistics at very small 
quark masses (see figure [j]) . In these cases there are a few 
configurations with reweighting factors close to zero. In 
order to see whether the small reweighting factors (and 
the corresponding small eigenvalues of Q 2 ) are associated 
to negative eigenvalues or not, we concentrated on config- 
urations with particularly small eigenvalues of Q 2 . 

Note that there is no simple analytical relation be- 
tween the lowest eigenvalues of the hermitean and the 
non-hermitean matrix but it is reasonable to expect that 
small eigenvalues occur together. This expectation was 
confirmed in all cases we investigated. An interesting ob- 
servation was that very small eigenvalues of the hermitean 
matrix seem to be usually associated to small real eigen- 
values of the non-hermitean one. This is compatible with 
the fact that real eigenvalues do not need to be double de- 
generate and therefore they can afford to approach closer 
to the origin than a complex conjugate pair. 

In figure || two significant examples are reported. The 
first set of configurations in the figure corresponds to a 
moderately small quark mass (run (/i)) and the second to 
a very small quark mass (run (i)). In both cases we se- 
lected the configurations with smallest eigenvalues of Q 2 . 
Even in this way we could not find a single real nega- 
tive eigenvalue for the first run (h). In the second case 
we found three configurations with negative eigenvalues. 
The (in total) four negative eigenvalues are visible in the 
detail in the right panel of figure ||. As stated before these 
configurations are statistically insignificant. 

Some comments are in order. We have collected strong 
evidence that the presence of configurations with negative 
determinant is irrelevant at this stage. Of course it is not 
yet possible to tell how this picture will evolve on larger 
volumes and closer to the continuum limit. It will be nec- 
essary to keep monitoring the low part of the spectrum as 
we did here. Since, to that purpose, we only need to know 
a very small part of the spectrum, there is no reason to 
think that this task should become too difficult on large 
volumes. 

As a last remark we should stress that we performed 
this analysis of the sign for very small quark masses. Even 
if (partially quenched) Chiral Perturbation Theory is valid 
for any combination of the quark masses, it is probably not 
worth having an unpaired sea quark with a mass much 
smaller than the strange quark. Therefore, provided that 
the picture will not dramatically change on larger lattices, 
for all physical circumstances it seems very unlikely that 
the determinant sign could become a problem. 
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Figure 4. Low- lying eigenvalues from a set of O(10) configurations for runs (a), (c) to (j). 



4.3 Flow of eigenvalues 

By using the algorithm of Kalkreuther and Simma p4j 
we also explored the flow of the spectrum {A} of the her- 
mitean matrix Q for a wide range of valence n values, going 
from zero bare quark mass to a large negative one. This 
is interesting in view of simulations of dynamical fermions 
with Neuberger's operator J37j, where the inverse square 
root of Q 2 with negative valence mass has to be taken. The 
optimal valence mass should be chosen in a region where Q 
has no eigenvalues extremely close to zero, namely where 
a "gap" is opening up in the spectrum near A = 0. The re- 
sults for two typical configurations are plotted in figure |[ 
For large negative masses we observed many sign changes 



and the eigenvalue with smallest absolute value is always 
close to zero. It seems that for dynamical Wilson fermions 
on our coarse lattice there is no gap-opening near A = 0. 



A possible application of the eigenvalue flow is to moni- 
tor the number of negative eigenvalues at n — n sea J38|,p^|. 
This is substantially cheaper than the analysis of the spec- 
trum of non-hermitean matrix Q by the Arnoldi method. 
For instance, observing the eigenvalue flow one can eas- 
ily exclude the absence of negative eigenvalues if there is 
no crossing of zero in the flow below K sea - which is the 
typical case. A more detailed (and more expensive) anal- 
ysis can be restricted to the seldom case when a crossing 
occurs. 
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5 Discussion 

Our runs on 8 3 • 16 lattices with lattice spacing of about 
a ~ 0.27 fm for Nj = 2 degenerate quarks display the de- 
pendence of simulation costs on the quark mass. Assuming 
the parametrization in (lj) with zl = 5 and z a = 2, from 
the integrated autocorrelation of the average plaquette we 
obtain 



F ~ 0.8 • 10 9 flop 



(51) 



The power for the quark mass dependence z^ comes out 
smaller than z vp = 6 in the form d) quoted by the CP- 
PACS, JLQCD Collaboration || but if we omit the point 
with largest quark mass and perform a fit with the para- 
metrization (g) we also obtain z^ p ~ 6 (see figure ||). 

As shown by figure |^ (left panel) , our data on the inte- 
grated autocorrelation of the average plaquette are well 
fitted by z T = 4 in the whole range 0.6 < M r < 6 
which approximately corresponds to im s < m u d < 2m s . 



The data on the integrated autocorrelation of the small- 
est eigenvalue of the squared hermitean fermion matrix 
show an even weaker power z„ ~ 3 but there the errors 
are larger and the fit is less convincing (see right panel 
in figure ||). The pion mass has the shortest autocorrela- 
tion; this also shows a power z n ~ 3. z^ — 4 corresponds 
to a behaviour proportional to the inverse square of the 
quark mass. Qualitatively speaking, according to table ^, 
one inverse quark mass power is due to the increase of 
the condition number of the fermion matrix and another 
inverse power comes from the increase of the autocorre- 
lation in numbers of update cycles. Note that because of 
{ro m ir) 2 oc {rom q ) the case of z n = 4, z a — 2 corresponds 
to a situation when the scale parameter ro cancels in the 
cost formula (Q). 

The overall factor F given in ( |5l| ) is such that for our 
second smallest quark mass M r ~ 0.6 (run (i)) the cost in 
floating point operations comes out to be C ~ 2.3 • 10 14 . 
As table ^ shows, considering instead of the integrated 
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autocorrelation of the average plaquette the one of the 
pion mass, the result is C ~ 0.4 • 10 14 . The parameters in 
(H) give the same number. The other estimates for Wilson- 
type quarks in § and 0] in this point are: Cl — 0.2 • 10 14 
and Cw — 1-1 • 10 14 , respectively. Taking into account 
that the numbers Cu,l,w have been obtained under rather 
different circumstances concerning simulation algorithm, 
autocorrelations considered, quark mass range, lattice size 
and even lattice action there is a surprisingly good order 
of magnitude agreement. 

It is remarkable that in a rather broad range of quark 
< "Tiud < m a (leaving out the point at m u d ~ 



2m s ) two fits with 



= 4 and 



•"Kp 



= 6 work equally 



well (figure |3|) . This implies in this range of quark masses 
a peculiar dependence of m^/rrip on M r (see figure 0, 
where the relation between the two different quark mass 
parameters \x r and M r is also shown). However, the two 
parametrizations in (^) and (^]) cannot be both correct in 
the vicinity of zero quark mass because there the two pow- 
ers have to be equal: z^ p = z^. Saying it differently, the 
extrapolations of the two fits below m u d — \"ms are dif- 
ferent. The fit with z np — 6 gives a more dramatic slowing 
down near zero quark mass than the one with z^ = 4. The 
real asymptotics near m u d — could be disentangled by 
going to still smaller quark masses. With TSMB there is 
no serious obstacle for doing this - except for the increase 
in necessary computer time. 

The value of the lattice spacing in this paper is chosen 
rather large in order to limit the computational costs for 
these tests. Our aim was to concentrate on the quark mass 
dependence in the range of light quarks. Further studies 
will be needed for investigating the cost as a function of 
the lattice spacing (in particular, the value of the expo- 
nent z a ) for smaller values of a. In this respect the expe- 
rience of the DESY-Munster-Roma Collaboration in the 
supersymmetric Yang-Mills theory at much smaller lattice 
spacings (a ~ 0.06fm) p9|PQl shows already that TSMB 
has a decent behaviour alsocloscr to the continuum limit. 

Besides the quark mass dependence of simulation costs, 
the other interesting question we investigated in this paper 
is the distribution of the small eigenvalues of the fermion 
matrix and, in particular, the existence of negative fermion 
determinants of a single quark flavour. Our data show that 
the effect of the fermion determinant is rather explicit be- 
cause of the strong suppression of the eigenvalue density 
near zero (see figures 0|q). The statistical weight of con- 
figurations with negative determinant is negligible even 
at our smallest quark masses. In fact after an extensive 
analysis we only found three configurations with negative 
determinant at our second smallest quark mass M r ~ 0.6 
(run (z)) and none of them at other quark masses. Taking 
into account the small reweighting factors of the configura- 
tions with negative determinant, their relative statistical 
weight is 0(1O~ 5 ). 

It is clear that it would be important to check the vol- 
ume dependence of our results, both for simulation costs 
and small eigenvalues, on larger lattices and closer to the 
continuum limit. We plan to do this in the future. 
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